Abstract

Beginning with the results of Girshick on the asymptotic distribution
of principal component loadings and those of Lawley on the distribution of
unrotated maximum likelihood factor loadings, the asymptotic distribution
of the corresponding analytically rotated loadings is obtained. The
principal difficulty is the fact that the transformation matrix which
produces the rotation is usually itself a function of the data. The
approach is to use implicit differentiation to find the partial derivatives
of an arbitrary orthogonal rotation algorithm. Specific details are given
for the orthomax algorithms and an example involving maximum likelihood
estimation and varimax rotation is presented.



STANDARD ERRORS FOR ROTATED FACTOR LOADINGS¹
1. Introduction

While its proponents have never questioned its importance, many
statisticians are surprised to learn that factor analysis is one of the most
popular methods of statistical investigation. An extensive computer usage
survey at UCLA found regression analysis, discriminant analysis, and
factor analysis the three most popular statistical methodologies. Informal
inquiries at other institutions indicate that more often than not, factor
analysis ranks in the top three. Computer programs for factor analysis are
unusual, however, in that they give no standard errors for the estimates
they produce. This is due in large measure to the fact that until now
formulas for the standard errors of estimates of rotated factor loadings,
the estimates which constitute the primary output of standard factor
analysis programs, have not been produced. In two important papers Lawley
[1953, 1967] identified the asymptotic standard errors of the unrotated
loadings produced in maximum likelihood factor analysis. Similar results
for principal components analysis were given some time ago by Girshick
[1939]. The difficulty in extending these results to the case of rotated
loadings is that the transformation matrix T which produces the rotation
is usually derived from the data. It is perhaps not surprising that
Lawley and Maxwell [1971] state, "It would be almost impossible to take
sampling errors in the elements of T into account. The only course is,
therefore, to ignore them in the hope that they are relatively small."
We shall show that these sampling errors can in fact be taken into account.
This is important because Wexler [1968] has produced some evidence to
indicate they cannot be safely ignored.

¹This research was supported in part by NIH Grant FR-3. The authors
are grateful to Mrs. Dorothy T. Thayer who implemented the algorithms
discussed here as well as those of Lawley and Maxwell. We are particularly
indebted to Dr. Michael Browne for convincing us of the significance of
this work and for helping to guide its development.

We begin with a p by k factor loading matrix $A = (a_{ir})$ and an
asymptotically normal estimate $\hat{A} = (\hat{a}_{ir})$. By this we mean
that as the size n of the sample on which $\hat{A}$ is based approaches
infinity, the distribution of $\sqrt{n}\,(\hat{A} - A)$ approaches
multivariate normal with mean zero.
The maximum likelihood estimates in the classical factor analysis model
and the principal components estimates in principal components analysis
both have this property. We are not concerned at this point with
specifically which estimates are being considered, but rather with the
effect of an orthogonal rotation algorithm on the asymptotic distribution
of $\hat{A}$. We have in mind algorithms such as quartimax, varimax, and
equimax, but for the present let h denote an arbitrary orthogonal rotation
algorithm. Specifically h is a function which maps an arbitrary p by k
matrix X into a p by k matrix Y = XT where T is an orthogonal matrix whose
value may, and generally will, depend on X. We are interested in the
asymptotic distribution of $\hat{\Lambda} = h(\hat{A}) = (\hat{\lambda}_{ir})$.
In particular if $\Lambda = h(A) = (\lambda_{ir})$ we would like to conclude
that $\sqrt{n}\,(\hat{\Lambda} - \Lambda)$ is asymptotically normally
distributed and to find its asymptotic covariance matrix.
This we shall do.

2. The Asymptotic Distribution of Orthogonally Rotated Loadings

In principle at least our task is quite simple. Let dh be the
differential of h at A. Then

(1)  $\sqrt{n}\,(\hat{\Lambda} - \Lambda) \doteq dh\big(\sqrt{n}\,(\hat{A} - A)\big)$

where "$\doteq$" is read "is asymptotically equal to" and means that the
difference between the left and right sides of (1) approaches zero in
probability as $n \to \infty$. This is the basis of the δ-method as
discussed by Rao [1965, p. 321]. Since dh is a linear transformation, and
$\hat{A}$ is an asymptotically normal estimator of A, $\hat{\Lambda}$ is an
asymptotically normal estimator of $\Lambda$ whose asymptotic covariance
matrix may be obtained from dh and the asymptotic covariance matrix of
$\hat{A}$. To be more explicit, let the differential of the relation
$\Lambda = h(A) = (h_{ir})$ be expressed by a formula of the form

(2)  $d\lambda_{ir} = \sum_{j,s} \frac{\partial h_{ir}}{\partial a_{js}}\, da_{js}$

Then the asymptotic covariances of the $\hat{\lambda}_{ir}$ may be expressed
in terms of those of the $\hat{a}_{ir}$ by means of the formula

(3)  $\mathrm{acov}(\hat{\lambda}_{ir}, \hat{\lambda}_{js}) = \sum_{m,n,u,v} \frac{\partial h_{ir}}{\partial a_{mu}}\, \frac{\partial h_{js}}{\partial a_{nv}}\, \mathrm{acov}(\hat{a}_{mu}, \hat{a}_{nv})$
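As a concrete reading of the covariance formula above, the following is a minimal sketch in plain Python. It assumes the loadings have been flattened to single indices, so that the partial derivatives form an ordinary Jacobian matrix, and it assumes the asymptotic covariances refer to the $\sqrt{n}$-scaled estimates. The function names `propagate_acov` and `standard_errors` are hypothetical and do not come from the paper.

```python
def propagate_acov(jac, acov_a):
    """Return J * acov_a * J^T, the delta-method covariance of rotated loadings.

    jac[x][u]    : partial derivative of rotated loading x w.r.t. unrotated loading u
    acov_a[u][v] : asymptotic covariance of unrotated loadings u and v
    Loadings are flattened to single indices (an assumption of this sketch).
    """
    m, n = len(jac), len(acov_a)
    return [[sum(jac[x][u] * acov_a[u][v] * jac[y][v]
                 for u in range(n) for v in range(n))
             for y in range(m)]
            for x in range(m)]


def standard_errors(acov, n_obs):
    """Standard errors, assuming acov describes sqrt(n_obs) * (estimate - truth)."""
    return [(acov[x][x] / n_obs) ** 0.5 for x in range(len(acov))]
```

With an identity Jacobian the covariance matrix is returned unchanged, which corresponds to a rotation that does not depend on the data.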

For rotation algorithms of interest, quartimax, varimax, and equimax,
it is difficult to find dh or, equivalently, the partial derivatives
$\partial h_{ir}/\partial a_{js}$ explicitly. Our approach is to use implicit
differentiation. Suppose that

(4)  $\psi(\Lambda) = 0$

is a k(k - 1)/2 dimensional constraint which is satisfied whenever $\Lambda$
is an h-rotation of A no matter what the value of A. Constraints
of this form for rotation algorithms of interest will be found in Section
3. By differentiating the relations

(5)  $\Lambda = AT, \quad T'T = I, \quad \psi(\Lambda) = 0$

one obtains

(6)  $d\Lambda = dA\,T + A\,dT$

(7)  $dT'\,T + T'\,dT = 0$

(8)  $d\psi(d\Lambda) = 0$

where $d\psi$ denotes the differential of $\psi$ at $\Lambda$. It follows
from (7) that T'dT is a skew-symmetric k by k matrix. Let $\mathcal{K}$
denote the space of all such matrices. It has dimension k(k - 1)/2.
Moreover, for each $K \in \mathcal{K}$ let

(9)  $L(K) = d\psi(\Lambda K)$.

Then L is a linear transformation from a k(k - 1)/2 dimensional space
into a k(k - 1)/2 dimensional space which we assume is invertible. This
is usually the case for constraint functions $\psi$ of interest. Since T'dT
is skew-symmetric it is in the domain of L and using (9) and (5) in order
gives

(10)  $L(T'dT) = d\psi(\Lambda T'dT) = d\psi(A\,dT)$.

Substituting (6) into (8) and using the linearity of $d\psi$ shows that
$d\psi(A\,dT) = -d\psi(dA\,T)$ so that from (10),

(11)  $L(T'dT) = d\psi(A\,dT) = -d\psi(dA\,T)$.

Thus

(12)  $T'dT = L^{-1}[d\psi(A\,dT)] = -L^{-1}[d\psi(dA\,T)]$.

Multiplying on the left by AT and using (5) and (6) gives the basic
relation

(13)  $d\Lambda = dA\,T - \Lambda L^{-1}[d\psi(dA\,T)]$

which expresses $d\Lambda$ in terms of dA and defines the differential of h
at A. It also defines the required partial derivatives of h. All that
is needed for a particular rotation algorithm is to find an appropriate
constraint function $\psi$ and to recover the partial derivatives
$\partial h_{ir}/\partial a_{js}$ from (13). The first task will be
relatively easy. The second is a little harder.

3. Constraints for an Orthogonal Algorithm

Orthogonal rotation algorithms are designed to optimize a criterion:

(14)  $Q = Q(\Lambda) = Q(AT)$

over all orthogonal k by k matrices T. The resulting $\Lambda = AT$ is
called the Q-rotation of A. In the case of quartimax, varimax, and equimax
rotation, Q is a quartic function of $\Lambda$. In target rotation, on the
other hand, Q is quadratic. There is, however, no need to specialize at
this point. An arbitrary Q is considered here and in the next section.
Assume that $\Lambda = AT$ optimizes Q and let dQ denote the differential
of Q at $\Lambda$. It is necessary that

(15)  $dQ(A\,dT) = 0$

for all dT which satisfy (7). Since T'dT is skew-symmetric, it is
necessary that

(16)  $dQ(\Lambda K) = 0$

for all $K \in \mathcal{K}$. Let K(r,s) be the elementary k by k
skew-symmetric matrix which has the value 1 in row r and column s, the
value -1 in row s and column r, and is zero elsewhere. Replacing K in (16)
by K(r,s) and writing the result in coordinate form gives, up to sign:

(17)  $\psi_{rs} = \sum_{i=1}^{p} \left( \lambda_{is} \frac{\partial Q}{\partial \lambda_{ir}} - \lambda_{ir} \frac{\partial Q}{\partial \lambda_{is}} \right) = 0$

for $1 \le r < s \le k$. These are the constraints which are needed. We
observe that the matrix $\Psi = (\psi_{rs})$ is skew-symmetric for
arbitrary $\Lambda$.

One may view (17) slightly differently. Let $\partial Q/\partial\Lambda$
denote the p by k matrix $(\partial Q/\partial\lambda_{ir})$ of partial
derivatives of Q. Then (17) says that $\Lambda'\,\partial Q/\partial\Lambda$
is symmetric. In the case of the quartimax criterion
$\partial Q/\partial\Lambda = \Lambda^{3}$ where
$\Lambda^{3} = (\lambda_{ir}^{3})$, and thus (17) demands that
$\Lambda'\Lambda^{3}$ be symmetric. This
furnishes a simple test for the convergence of a quartimax algorithm.
Corresponding tests apply to other orthogonal algorithms.
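The symmetry test for quartimax convergence can be sketched in a few lines of plain Python. The sketch measures how far the k by k matrix with entries $\sum_i \lambda_{ir}\lambda_{is}^{3}$ is from being symmetric. The function name `quartimax_asymmetry` is hypothetical, and the tolerance to apply to its result is left to the user.

```python
def quartimax_asymmetry(lam):
    """Largest |M[r][s] - M[s][r]| for M = Lambda' Lambda^3 (elementwise cube).

    lam is a p-by-k loading matrix given as a list of rows. A value near
    zero indicates the quartimax stationarity condition is satisfied.
    """
    k = len(lam[0])
    m = [[sum(row[r] * row[s] ** 3 for row in lam) for s in range(k)]
         for r in range(k)]
    return max(abs(m[r][s] - m[s][r]) for r in range(k) for s in range(k))
```

A loading matrix with perfectly simple structure, such as the identity, gives an asymmetry of zero. A matrix that has not been rotated to a stationary point generally does not.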
4. The Differential of an Orthogonal Algorithm

We turn now to the problem of finding formulas for the partial
derivatives of an orthogonal rotation algorithm h. Note first that the
elementary skew-symmetric matrices

(18)  $K(u,v), \quad 1 \le u < v \le k$

form a basis for $\mathcal{K}$. The k(k - 1)/2 by k(k - 1)/2 matrix
$(L_{ij})$ of L as defined in (9) relative to this basis taken in
lexicographic order is given by:

(19)  $L_{\ell(r,s),\,\ell(u,v)} = d\psi_{rs}(\Lambda K(u,v)) = \sum_{i=1}^{p} \left( \lambda_{iu} \frac{\partial \psi_{rs}}{\partial \lambda_{iv}} - \lambda_{iv} \frac{\partial \psi_{rs}}{\partial \lambda_{iu}} \right)$

where

(20)  $\ell(r,s) = (r - 1)(2k - r)/2 + s - r$

for $1 \le r < s \le k$ and $1 \le u < v \le k$.
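The lexicographic index in (20) is easy to check by enumeration. The sketch below uses the paper's 1-based convention for r, s, and the resulting position; the function name `pair_index` is hypothetical.

```python
def pair_index(r, s, k):
    """1-based lexicographic position of the pair (r, s), 1 <= r < s <= k, per (20)."""
    if not 1 <= r < s <= k:
        raise ValueError("need 1 <= r < s <= k")
    # (r - 1) * (2k - r) is always even, so integer division is exact.
    return (r - 1) * (2 * k - r) // 2 + s - r
```

For k = 4 the pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) map to positions 1 through 6 in that order.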

While it is not evident from our derivation and not essential to what
follows, the matrix $(L_{ij})$ is in fact symmetric and nonnegative
definite.² Under the assumption that L is nonsingular, $(L_{ij})$ is
positive definite.

²It can be shown that $(L_{ij})$ is a matrix of second partial derivatives
evaluated at the minimum of an appropriate function.



Let $(L^{ij})$ be the inverse of the matrix $(L_{ij})$ and let

(21)  $e_{iruv} = [\Lambda L^{-1}(K(u,v))]_{ir}$

be the (i,r)-th component of the matrix $\Lambda L^{-1}(K(u,v))$. Using the
fact that $(L^{ij})$ is the matrix of $L^{-1}$ with respect to the basis in
(18),

(22)  $e_{iruv} = \sum_{t=1}^{r-1} \lambda_{it}\, L^{\ell(t,r),\,\ell(u,v)} - \sum_{t=r+1}^{k} \lambda_{it}\, L^{\ell(r,t),\,\ell(u,v)}$

for $1 \le i \le p$, $1 \le r \le k$, and $1 \le u < v \le k$. The sums in
(22) are zero when the lower limit exceeds the upper limit.

Finally, using (21), the basic relation (13) may be put in the
coordinate form:

(23)  $d\lambda_{ir} = \sum_{s=1}^{k} da_{is}\, T_{sr} - \sum_{u<v} \sum_{j,s,t} e_{iruv}\, \frac{\partial \psi_{uv}}{\partial \lambda_{jt}}\, da_{js}\, T_{st}$.

Reading the partial derivatives of h from this gives:

(24)  $\frac{\partial h_{ir}}{\partial a_{js}} = \delta_{ij}\, T_{sr} - \sum_{u<v} \sum_{t=1}^{k} e_{iruv}\, \frac{\partial \psi_{uv}}{\partial \lambda_{jt}}\, T_{st}$

for $1 \le i, j \le p$ and $1 \le r, s \le k$. Here $\delta_{ij}$ denotes
the Kronecker delta.

In summary the partial derivatives of h may be computed from $\Lambda$,
T, and a formula for Q as follows:

(i) Use Q and (17) to obtain formulas for the $\psi_{rs}$.
(ii) Compute the values of the partial derivatives
$\partial\psi_{rs}/\partial\lambda_{it}$.
(iii) Using (19) form the matrix $(L_{ij})$ and invert it.
(iv) Compute the $e_{iruv}$ using (21).
(v) Using (24) compute the partial derivatives of h.

5. Orthomax Algorithms

We turn now to the problem of finding specific formulas for the
orthomax algorithms. These algorithms are designed to maximize the
orthomax criterion:

(25)  $Q = \frac{1}{4} \sum_{r=1}^{k} \left[ \sum_{i=1}^{p} \lambda_{ir}^{4} - \frac{\gamma}{p} \left( \sum_{i=1}^{p} \lambda_{ir}^{2} \right)^{2} \right]$.

This becomes the quartimax, varimax, and equimax criterion when
$\gamma = 0$, 1, and k/2 respectively. Using (17) the corresponding
constraint functions are:

(26)  $\psi_{rs} = \sum_{i=1}^{p} \lambda_{ir} \lambda_{is} (\lambda_{ir}^{2} - \lambda_{is}^{2}) - \frac{\gamma}{p} \sum_{i=1}^{p} \lambda_{ir} \lambda_{is} \sum_{i=1}^{p} (\lambda_{ir}^{2} - \lambda_{is}^{2})$

for $1 \le r, s \le k$. Alternatively, these constraint functions may be
found, in the $\gamma = 0, 1$ cases at least, by setting the rotation angle
$\phi$ equal to zero in the quartimax and varimax algorithms described by
Harman [1967, p. 300 and p. 307].
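To make (25) and (26) concrete, here is a minimal sketch in plain Python that evaluates the orthomax criterion and one constraint function for a given loading matrix. It assumes the criterion carries the factor 1/4, and it uses 0-based column indices r and s, unlike the 1-based indices of the text. The function names are hypothetical.

```python
def orthomax_criterion(lam, gamma):
    """Orthomax criterion Q of (25) for a p-by-k loading matrix (list of rows)."""
    p, k = len(lam), len(lam[0])
    q = 0.0
    for r in range(k):
        col = [row[r] for row in lam]
        q += sum(x ** 4 for x in col) - (gamma / p) * sum(x * x for x in col) ** 2
    return q / 4.0


def orthomax_psi(lam, gamma, r, s):
    """Constraint function psi_rs of (26); r and s are 0-based column indices."""
    p = len(lam)
    cubic = sum(row[r] * row[s] * (row[r] ** 2 - row[s] ** 2) for row in lam)
    cross = sum(row[r] * row[s] for row in lam)
    diff = sum(row[r] ** 2 - row[s] ** 2 for row in lam)
    return cubic - (gamma / p) * cross * diff
```

Setting `gamma` to 0, 1, or k/2 gives the quartimax, varimax, and equimax versions. At a converged rotation every `orthomax_psi(lam, gamma, r, s)` should be close to zero.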

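The partial derivatives of the $\psi_{rs}$ follow easily from (26). For
$1 \le i \le p$ and $1 \le r \ne s \le k$ they are:

(27)  $\frac{\partial \psi_{rs}}{\partial \lambda_{ir}} = 3\lambda_{ir}^{2}\lambda_{is} - \lambda_{is}^{3} - \frac{\gamma}{p} \left[ \lambda_{is} \sum_{j=1}^{p} (\lambda_{jr}^{2} - \lambda_{js}^{2}) + 2\lambda_{ir} \sum_{j=1}^{p} \lambda_{jr}\lambda_{js} \right]$

      $\frac{\partial \psi_{rs}}{\partial \lambda_{is}} = \lambda_{ir}^{3} - 3\lambda_{ir}\lambda_{is}^{2} - \frac{\gamma}{p} \left[ \lambda_{ir} \sum_{j=1}^{p} (\lambda_{jr}^{2} - \lambda_{js}^{2}) - 2\lambda_{is} \sum_{j=1}^{p} \lambda_{jr}\lambda_{js} \right]$

For all other values of i, r, s, and t:

(28)  $\frac{\partial \psi_{rs}}{\partial \lambda_{it}} = 0$.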

In summary the partial derivatives of an orthomax algorithm h may
be computed from $\Lambda$ and T as follows:

(i) Using (27) and (28) compute the partial derivatives of $\psi$.
(ii) Using (19) form $(L_{ij})$ and invert it.
(iii) Using (21) compute the $e_{iruv}$.
(iv) Use (24) to give the partial derivatives of h.
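The steps above can be sketched in plain Python. Rather than tabulating the $e_{iruv}$ and the partial derivatives of h one by one, this sketch applies the basic relation (13) directly to a given perturbation dA, which is equivalent to contracting (24) with dA. It solves the linear system for L by a small Gaussian elimination instead of forming an explicit inverse. All function names are hypothetical, indices are 0-based, and the derivative with respect to $\lambda_{is}$ is obtained by differentiating (26).

```python
def grad_psi(lam, gamma, r, s):
    """Gradient of psi_rs (26) w.r.t. lam; nonzero only in columns r and s."""
    p, k = len(lam), len(lam[0])
    c = gamma / p
    cross = sum(row[r] * row[s] for row in lam)
    diff = sum(row[r] ** 2 - row[s] ** 2 for row in lam)
    g = [[0.0] * k for _ in range(p)]
    for i, row in enumerate(lam):
        a, b = row[r], row[s]
        g[i][r] = 3 * a * a * b - b ** 3 - c * (b * diff + 2 * a * cross)
        g[i][s] = a ** 3 - 3 * a * b * b - c * (a * diff - 2 * b * cross)
    return g


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(m[i][c]))
        if abs(m[piv][c]) < 1e-12:
            raise ValueError("L is (numerically) singular")
        m[c], m[piv] = m[piv], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rotated_differential(lam, t, d_a, gamma):
    """d(Lambda) from (13) for a perturbation d_a of A, where Lambda = A T."""
    p, k = len(lam), len(t)
    pairs = [(u, v) for u in range(k) for v in range(u + 1, k)]
    grads = [grad_psi(lam, gamma, r, s) for r, s in pairs]
    # Eq. (19): entry for row pair (r, s) and column pair (u, v).
    l_mat = [[sum(lam[i][u] * g[i][v] - lam[i][v] * g[i][u] for i in range(p))
              for (u, v) in pairs] for g in grads]
    # m = dA T, and rhs holds the components of d psi(dA T).
    m = [[sum(d_a[i][s] * t[s][r] for s in range(k)) for r in range(k)]
         for i in range(p)]
    rhs = [sum(g[i][j] * m[i][j] for i in range(p) for j in range(k)) for g in grads]
    x = solve(l_mat, rhs)
    # Skew-symmetric matrix L^{-1}[d psi(dA T)] in the basis K(u, v).
    kmat = [[0.0] * k for _ in range(k)]
    for (u, v), xb in zip(pairs, x):
        kmat[u][v] += xb
        kmat[v][u] -= xb
    return [[m[i][r] - sum(lam[i][q] * kmat[q][r] for q in range(k))
             for r in range(k)] for i in range(p)]
```

Calling `rotated_differential` once for each unit perturbation of a single element of A yields one column of the Jacobian of h at a time. By construction the returned differential satisfies the constraint (8), which provides a simple internal check.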

As observed earlier, we must assume that L as defined by (9) is
nonsingular when $\Lambda$ is an orthomax rotation of A. This needs to be an
assumption for it is not always true. It is easy to show, for example, that
if the orthomax criterion has the same value for every rotation of A, then
L = 0 and is clearly singular. In the two factor case this is the only way
in which L can be singular. More generally, in the course of a simulation
study the authors have looked at thousands of randomly selected L
transformations arising in quartimax rotation. Not one of these was singular.
The same was true for a smaller set of varimax rotations. There is
presently, at least, no indication that our nonsingularity assumption will
prove to be a practical difficulty. Indeed, in the cases looked at, the
matrix $(L_{ij})$ was not only nonsingular but fairly well conditioned.

6. An Example and Discussion

Lawley and Maxwell [1971, p. 6?] give an analysis of correlation
coefficients for the scores of a sample of 292 children on a set of
10 cognitive tests. Their unrotated maximum likelihood estimates for
loadings on three factors are given in Table 1. The standard errors of
these loadings obtained by evaluating Lawley's formulas [Lawley and
Maxwell, 1971, p. 62] for standardized loadings (i.e., loadings computed
from correlations) are given in Table 2. These standard errors are
pleasingly small, ranging in value from .022 to .094.



Insert Tables 1 and 2 about here 



Turning to the rotated case, Table 3 contains a varimax rotation
of the loadings in Table 1 together with the transformation matrix T
which produced them [Lawley and Maxwell, p. 75]. As can be seen from the
matrix T, a substantial rotation has been made. The resulting structure,
however, is not particularly simple. The standard errors for the rotated
loadings in Table 3 computed by using Lawley's formulas and ignoring the
fact that T is computed from the data are given in Table 4. We will
refer to these as uncorrected standard errors. Again their values are
pleasingly small, ranging from .034 to .077. Using the results developed
here, Table 5 contains the corresponding standard errors corrected for
sample variation in T. The values cover roughly the same range, from
.035 to .094. Before turning to a direct comparison of the uncorrected
and corrected standard errors, we note that what has been presented
thus far demonstrates the feasibility of computing standard errors for
rotated loadings. To our knowledge this is the first time that standard
errors for the rotated case have been presented in the literature.



Insert Tables 3, 4, and 5 about here



There are important and substantial differences between the
uncorrected and corrected standard errors. Figure 1 is a plot of the
corrected standard errors of Table 5 against the uncorrected errors in
Table 4. The uncorrected standard errors range from 41% below to 70%
above the corrected standard errors. These differences support Wexler's
[1968] simulation study which showed large discrepancies between uncorrected
standard errors and standard errors obtained by simulation. The
differences between uncorrected and corrected standard errors may be made
arbitrarily large by choosing the data carefully. Using artificial data
the authors have computed standard errors which differ by more than pO
fold. Because they originated from real data, however, the differences in
Tables 4 and 5 are probably more relevant. It should be observed that the
differences displayed in Figure 1 represent real theoretical discrepancies,
not random fluctuations.



Insert Figure 1 about here

The standard errors presented give a simple indication of how stable
factor loading estimates are. A quick significance test can be based on
the rule which declares an observed difference significant if it exceeds
twice the sum of the corresponding standard errors. Under this rule the
estimates in Table 3 of $\lambda_{\cdot\cdot}$ and $\lambda_{\cdot\cdot}$
differ significantly while those of $\lambda_{\cdot\cdot}$ and
$\lambda_{\cdot\cdot}$ do not. A more sensitive test could be made by taking
account of the covariances between the factor loading estimates.
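The quick rule just described amounts to a one-line comparison. The sketch below is a direct transcription in Python. The function name and the numbers in the usage note are hypothetical illustrations, not values taken from the tables.

```python
def differs_significantly(est1, se1, est2, se2):
    """Quick rule: significant if |est1 - est2| exceeds twice the sum of the SEs."""
    return abs(est1 - est2) > 2.0 * (se1 + se2)
```

For example, loadings of .76 and .33 with standard errors of .04 and .05 differ by .43, which exceeds 2(.04 + .05) = .18, so the rule declares them different. Loadings of .46 and .35 with standard errors of .05 each do not pass, since .11 is less than .20.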

Indeed, since the covariances are available, familiar procedures make
it possible to test almost any hypothesis about rotated factor loadings.
For example, is one variable related to the factors in the same way as
another or, given a sample from a second population, does the factor
pattern there differ from that of the present population? One may also
produce simultaneous confidence intervals which allow him to scan across
one or more tables of factor loadings in search of significant differences
and account for the fact that he is scanning.

Before attacking such generalizations, however, it may be wise to
ascertain how well the asymptotic results perform on finite samples.
Preliminary work here is encouraging but only begun. Another natural
next step is to derive similar results for the oblique case. The results
derived here could be used with other methods of extraction such as those
of minres and alpha factor analysis except that the asymptotic covariances
for unrotated loadings have not been derived for these methods. This
provides still another area for investigation.
TABLE 1

Unrotated factor loadings for a set of cognitive tests

                                     Factor
      Test                       I       II      III    Communality

   1. Comprehension            .788    -.152    -.352      .768
   2. Arithmetic               .874     .381     .041      .911
   3. Similarities             .814    -.043    -.213      .710
   4. Vocabulary               .798    -.170    -.204      .707
   5. Digit span               .641     .070    -.042      .418
   6. Picture completion       .755    -.298     .067      .663
   7. Picture arrangement      .782    -.221     .028      .661
   8. Block design             .767    -.091     .358      .725
   9. Object assembly          .733    -.384     .229      .737
  10. Coding                   .771    -.101     .071      .610
TABLE 2

Standard errors for the unrotated loadings in Table 1

                     Factor
  Variate       I       II      III

     1        .025     .094     .0?9
     2        .040     .065     .046
     3        .022     .085     .0?5
     4        .0??     .071     .048
     5        .056     .076     .059
     6        .028     .0?5     .059
     7        .026              .051
     8        .027     .076     .051
     9        .051     .059     .064
    10        .026     .057     .048
TABLE 3

Varimax rotation of loadings in Table 1

                     Factor
  Variate       I       II      III    Communality

     1        .759     .329     .289      .768
     2        .340     .849     .274      .911
     3        .633     .450     .327      .710
     4        .657     .345     .397      .707
     5        .370     .453     .276      .418
     6        .464     .263     .615      .663
     7        .485     .332     .561      .660
     8        .183     .472     .684      .725
     9        .354     .209     .754      .737
    10        .409     .423     .514      .610

  Rotation Matrix T

              .560     .633     .534
             -.311     .758    -.573
             -.768     .153     .622
TABLE 4

Uncorrected standard errors for the rotated loadings in Table 3

                     Factor
  Variate       I       II      III

     1        .058     .077     .067
     2        .059     .064     .045
     3        .058     .069     .055
     4        .034     .060     .056
     5        .057     .065     .055
     6        .050     .046     .041
     7        .042     .046
     8        .055     .065
     9        .065     .059     .054
    10        .045     .051     .042
TABLE 5

Corrected standard errors for the rotated loadings in Table 3

                     Factor
  Variate       I       II      III

     1        .041              .056
     2        .064     .094     .056
     3        .044     .041     .059
     4                 .043     .04?
     5        .051     .049     .050
     6        .049     .044     .044
     7        .048     .042     .045
     8        .035     .050     .046
     9        .047     .041     .042
    10        .047     .04?     .044
Figure Caption

Figure 1. Corrected versus uncorrected standard errors. Each
corrected standard error in Table 5 is plotted against the corresponding
uncorrected standard error from Table 4. A scale unit equals 0.01.